Abstract: In a method for transmission of video information between HTTP
servers and clients in a shared network resource, particularly the Internet,
the video information is stored as a video file consisting of packet-divided
video streams compression-coded with average bit rates t[c] which cover the
clients' expected channel bit rates σ. Each packet and the video file are
supplied with a header containing information for realizing a
bandwidth-scalable video transmission over a suitable version of HTTP.
During transmission, switching between the video streams takes place on the
basis of an estimation of the channel bit rate σ and the information in the
packet header, such that the bit rate t[c] is adapted to the client's actual
channel bit rate σ. In a method for client-executed search and retrieval of
video information in a shared network resource, particularly searching of a
video frame Fx in a packet-divided video stream, the packets in a video
stream are divided into groups, with search-specific information in the
header of the first packet in each group. On the basis of given search
criteria and search-specific information in the packet headers, the packet
with the video frame Fx is found, such that a pseudo-random search and
retrieval are realized with the use of a suitable version of HTTP as
transport protocol.

Methods in transmission and searching of video information.

The invention concerns a method in transmission on request of video 
information in a shared network resource, wherein the shared network 
resource particularly is Internet, an intranet or extranet, wherein the video
information is stored in the form of an encoded video file on HTTP servers in 
the shared network resource and accessed by clients via HTTP (Hypertext 
Transfer Protocol), wherein each client has a video decoder, and the initial 
video information is in the form of digitized video signals. 

The invention also concerns a method for client-executed search and retrieval 
of video information in a shared network resource, particularly search and 
retrieval of a desired frame in a video stream, wherein the shared network 
resource particularly is Internet, an intranet or extranet, wherein the video 
information is stored in form of an encoded video file on HTTP servers in the 
shared network resource and accessed by clients via HTTP (Hypertext 
Transfer Protocol), wherein each client has a video decoder, wherein the 
encoded video file is concatenated of multiple encoded video streams which 
contain the video signals of the video information compressed at an average 
rate t[c] which covers the client’s expected channel bit rate σ, wherein
each encoded video stream is divided into p packets with varying lengths,
wherein each packet comprises a header and payload, wherein the packets in
a stream are provided in non-overlapping, successive groups of two or more
successive packets, such that each stream is divided in m groups of this kind,
wherein the header of the first packet of each group in addition to
information of the number n of video frames which the packet contains and
references to other packets and streams, is provided with information of a
jump offset dj which corresponds to the combined lengths of the packets in
the group and the number j of frames which the jump offset dj and the first
following packet in the following group comprise, wherein the video file
further comprises a header which contains information of the parameters of
the streams, wherein the information of the parameters of the streams
includes the distances dk and d1 from the beginning of the video file to
respectively the beginning of each stream and to the end of the first packet
in each stream, and wherein the transmission of video information takes
place bandwidth scalable over a version of HTTP which allows persistent
connection and specification of byte range.

In a shared network resource the separate resources will have a varying
quality and varying operative parameters such that the shared network
resource appears as a heterogeneous communication network without a
guaranteed service quality. Even though the invention generally concerns
services in shared network resources, the discussion in the following will
be directed towards the Internet, which is the best known and most
widespread instance of a publicly available shared network resource. As is
well known, the bandwidth of the network connections of the Internet is very
variable. Typically the connection bandwidth may vary from 20 to 500
kilobits/s. As the service quality on the Internet cannot be guaranteed, the
bandwidth and packet delay for a given connection may fluctuate due to
network congestion. This is a serious obstacle for transmitting
bandwidth-intensive and time-sensitive data such as video information over
the Internet.

A video signal must be compressed in order to reduce the necessary
bandwidth for transmission over the Internet. For transmission on request the
signal is compressed once with an average target bit rate. When lossy
compression is employed, a distortion is introduced in the decompressed
signal. The quality of the decompressed signal is proportional to the target
bit rate. The inherent heterogeneity of the Internet poses a dilemma when the
target bit rate shall be determined. On one hand the target bit rate should be
so high that clients with large bandwidth receive a high-quality signal,
but then clients with small bandwidth will not receive the same signal in real
time. On the other hand the target bit rate should be so low that clients with
small bandwidth receive the signal in real time, but then clients with large
bandwidth will receive a low-quality signal. The solution to this is to use
bandwidth-scalable compression. Bandwidth-scalable compression means
that a number of subsets with different average target bit rates and
corresponding quality can be extracted from the coded signal. When the
compressed signal is transmitted to the client, the signal is hence adapted to
the client’s available channel bandwidth.

Now the prior art shall be discussed. Existing bandwidth-scalable video
stream architecture for video streams on demand requires a dedicated video
server (see J. Hunter, V. Witana, M. Antoniades, "A Review of Video
Streaming over the Internet", DSTC Technical Report TR97-10, August
1997). The client connects to the video server and the server performs
bandwidth scaling according to one method or other. The most usual method
is encoding several streams with different average bit rates in one file. The
server then switches between the streams dependent on the channel bit rate of
the clients. This solution has two disadvantages. The first one is that a
dedicated video server is necessary to deliver bandwidth-scalable video
streams on demand and the second one is that firewalls between the video
server and the clients must be configured particularly such that the video
streams are allowed to pass through.

There are presently also known video stream architectures which apply
HTTP as transport protocol. Such architectures are implemented as follows:
the client requests a video file. When the HTTP server receives the request it
starts to transmit the video file in an HTTP response to the client. By using
HTTP as transport protocol it is not necessary with any dedicated video
server. Video streams transmitted with HTTP will not normally be blocked by
firewalls where web browsing is possible, which increases the number of
clients which are able to receive the stream. The existing video stream
architectures based on HTTP have two disadvantages (see RealNetworks Inc.,
"Delivering RealAudio or RealVideo from a Web Server", RealNetworks
Technical Blueprint Series, 1998). The first one is that they are not
bandwidth-scalable and the second one is that it is not possible to perform a
random search in the video stream.

A suitable version of HTTP such as version 1.1 has two interesting properties,
namely persistent connection and specification of byte range (see R. Fielding,
J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee, "Hypertext Transfer Protocol
-- HTTP/1.1", RFC 2068, UC Irvine, DEC, and MIT/LCS, January 1997).
Persistent connection means that several HTTP requests can be transmitted
over a so-called socket connection, i.e. an identifier for a particular
service, without a new connection having to be set up for each HTTP request
to the server. Byte range specification opens for the possibility to request a
subset of the file on the HTTP server. A timer closes the connection if the
request is not received by the HTTP server within a given time interval, and
the server may limit the number of requests during one connection, e.g. to
100. These two properties of e.g. HTTP version 1.1 are the basis of the
present invention.
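
The byte range property can be illustrated with a short sketch. In HTTP
version 1.1 a client asks for a subset of a file with a Range request header of
the form "bytes=first-last", both positions being inclusive and counted from 0.
The function names below are illustrative assumptions and do not occur in the
present description.

```python
def byte_range_header(first: int, last: int) -> dict:
    """Build the HTTP/1.1 request header for the inclusive byte range first..last."""
    if first < 0 or last < first:
        raise ValueError("need 0 <= first <= last")
    return {"Range": f"bytes={first}-{last}"}


def range_length(first: int, last: int) -> int:
    """Number of bytes covered by an inclusive byte range."""
    return last - first + 1
```

Several such requests can be sent one after another over the same persistent
connection, which is what allows a client to fetch a file piece by piece.
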

In order to overcome the disadvantages of the prior art it is a first object of
the present invention hence to realize a bandwidth-scalable video
transmission with the use of a suitable version of HTTP as transport protocol
and further another object to enable pseudorandom searching of video
information with the use of a suitable version of HTTP as transport protocol.

The above-mentioned first object and other features and advantages are
achieved according to the invention with a method which is characterized by
generating by means of a video encoder y multiple encoded video streams
which each comprises the video signals of the initial video information
compressed with average bit rates t[c] which cover the clients’ expected
channel bit rates σ, the video encoder generating independently decodable
video frames at given time intervals; generating y encoded intermediate
streams from the corresponding encoded video streams by dividing an
encoded video stream in p packets with varying lengths q, each packet
comprising a header and a payload which contains the encoded video signals
for a time segment corresponding to the payload; providing the header with
the following information: (i) the distances d1 and d2 respectively to the
beginning and to the end of the nearest following packet, (ii) the number n of
video frames that the packet comprises, and (iii) a reference to the
corresponding packet in respectively the encoded intermediate stream with a
closest lower average bit rate t[ck-1] and the encoded intermediate stream
with a closest higher average bit rate t[ck+1], t[ck] being the bit rate for the
present intermediate stream and k, b, a ∈ y; providing an independently
decodable video frame at the beginning of the payload of a packet;
concatenating the intermediate streams into a final file which is stored on
one or more HTTP servers; and providing the final file with a header which
contains information about the parameters of the streams; and further by the
following steps effected by the client: generating a request for the header
and the beginning of the first stream in the final file; estimating the channel
bit rate σ and selecting the stream whose bit rate t[ck] is the closest bit rate
relative to the estimated channel bit rate σ as the initial stream of the
transmission, such that t[ck] ≈ σ, and then estimating the channel bit rate σ
during the transmission and, if the estimate σ is lower than the average bit
rate t[ck] of the current stream, switching to the stream with the closest
lower average bit rate t[ck-1] or, if the estimate σ is higher than the average
bit rate t[ck] of the current stream, switching to the stream with the closest
higher average bit rate t[ck+1], the switching of the streams taking place on
the basis of the packet references and realizing a bandwidth-scalable video
transmission.
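
The selection of the initial stream can be sketched as follows. This is a
minimal illustration only: the function name and the bit rates used in the
example are hypothetical, and a tie between two equally close streams is
resolved in favour of the lower one, which the description does not specify.

```python
def select_initial_stream(bit_rates, sigma):
    """Return the index k of the stream whose average bit rate t[c_k]
    lies closest to the estimated channel bit rate sigma."""
    return min(range(len(bit_rates)), key=lambda k: abs(bit_rates[k] - sigma))


# Hypothetical average bit rates in kbit/s, covering the 20-500 kbit/s range.
example_rates = [20, 50, 100, 200, 350, 500]
initial = select_initial_stream(example_rates, sigma=120)
```

With the hypothetical rates above and an estimate of 120 kbit/s, the 100 kbit/s
stream is chosen as the initial stream.
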

In an embodiment of the method for transmission according to the invention
the channel bit rate σ is either estimated on the basis of an estimator
σ̂ = x/τ, wherein σ̂ is the estimated channel bit rate and x the number of
buffered bits in the time interval τ, such that if σ̂ > t[ck] and a buffer
length less than twice the minimum packet length, switching takes place to the
stream with a closest higher target bit rate t[ca], or, if σ̂ < t[ck] and the
buffer length less than the minimum packet length, switching takes place to
the stream with a closest lower target bit rate t[cb], or on the basis of an
estimator σ̂ = [...], such that if σ̂ ≥ 0, switching takes place to the stream
with a closest higher target bit rate t[ca], or if σ̂ < 0, switching takes
place to the stream with a closest lower target bit rate t[cb]. In the last
case a boundary value Δσ is determined such that switching takes place if
|σ̂| ≥ |Δσ|. Alternatively the channel bit rate can according to the invention
be estimated by repeatedly integrating the channel bit rate σ over succeeding
time intervals τ = ta - tb, ta and tb being respectively an upper and lower
boundary for the interval τ and an integration result Σ being given by
Σ = ∫σ dt, and comparing the integration results Σ1, Σ2 for respective
succeeding time intervals τ1, τ2, such that if Σ2 - Σ1 ≥ 0, switching takes
place to the stream with a closest higher target bit rate t[ca], or, if
Σ2 - Σ1 < 0, switching takes place to the stream with a closest lower target
bit rate t[cb], a boundary value ΔΣ preferably being determined such that
switching takes place if |Σ2 - Σ1| ≥ |ΔΣ|.
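
The first and the last of these estimators can be sketched as follows. The
sketch assumes that the channel bit rate is available as equally spaced
samples and approximates the integral with the trapezoidal rule; the function
names, the sampling and the return convention (+1 for the closest higher
stream, -1 for the closest lower stream, 0 for no switching) are assumptions
made for illustration.

```python
def estimate_rate(buffered_bits, tau):
    """Estimator x / tau: bits buffered during an interval of tau seconds."""
    return buffered_bits / tau


def integral(samples, dt):
    """Trapezoidal approximation of the integral of the bit rate over one interval."""
    return sum((a + b) / 2 * dt for a, b in zip(samples, samples[1:]))


def switch_decision(previous, current, dt, boundary):
    """Compare the integration results of two succeeding intervals.

    Returns +1 (closest higher stream), -1 (closest lower stream) or 0
    (difference smaller than the boundary value, no switching).
    """
    difference = integral(current, dt) - integral(previous, dt)
    if abs(difference) < abs(boundary):
        return 0
    return 1 if difference >= 0 else -1
```

The boundary value keeps small fluctuations of the channel bit rate from
causing a stream switch at every interval.
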

In an advantageous embodiment of the method for transmission according to
the invention an HTTP server adapted to HTTP version 1.1 is used and the
video transmission then takes place over the same version of HTTP.

In another advantageous embodiment of the method for transmission
according to the invention the packets are provided in the streams in
non-overlapping successive groups of two or more successive packets, such
that each stream is divided into m groups of this kind, and the header of the
first packet in a group is provided with information of a jump offset dj which
corresponds to the combined lengths of the packets in the group and the
number j of frames which is covered by the jump offset dj and the first packet
in the following group.
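
How a jump offset allows a client to skip whole groups can be sketched as
follows. The sketch makes several simplifying assumptions that are not taken
from the description: frames are numbered from 0, j is treated as the number
of frames in the group, a jump offset of 0 marks the last group, and the
hypothetical callback fetch_group_header stands in for a byte-range request
for the header of the first packet of a group.

```python
def find_group(fetch_group_header, first_group_offset, x):
    """Walk the group headers until the group containing frame number x is found.

    fetch_group_header(offset) -> (dj, j) returns the jump offset and the
    frame count read from the first packet header of the group at `offset`.
    Returns (byte offset of the group, number of the first frame in the group).
    """
    offset, first_frame = first_group_offset, 0
    while True:
        dj, j = fetch_group_header(offset)
        if x < first_frame + j:
            return offset, first_frame
        if dj == 0:
            raise LookupError("frame lies beyond the end of the stream")
        offset += dj          # jump over the whole group
        first_frame += j
```

Only one small header per group has to be downloaded until the right group is
reached, instead of the payload of every packet before the desired frame.
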

In a third advantageous embodiment of the method according to the invention
the streams are provided at random in the final file, and the stream with the
lowest bit rate t[cL] refers only to packets of the stream with the closest
higher bit rate and the stream with the highest bit rate t[cH] refers only to
packets in the stream with the closest lower bit rate.

In a fourth advantageous embodiment of the method according to the
invention the streams are provided successively with increasing bit rate
t[c] in the final file, such that the stream with the lowest bit rate t[cL] is the
first video stream of the file and the stream with the highest bit rate t[cH] is
the last video stream of the file.

Preferably the streams contain an independently decodable video frame at 
positions corresponding to the beginning of each packet in the current 
stream. 

Preferably the encoded streams contain the same number of video frames.

If the streams have different frame rates, it is according to the invention
advantageous to define an adjustment frame as a particular frame type and
to adjust the frame rates of each stream to the same rate by inserting a
suitable number of adjustment frames in the respective streams.

Preferably the packet length q is at most equal to a configurable request
time-out interval for an HTTP server and preferably the encoded streams are
processed in parallel frame by frame.

Finally, it is in the method for transmission according to the invention
advantageous that the header of the final file comprises information of the
number y of streams of the file, the average bit rate t[c] of each stream and
the distances dk and d1 from the beginning of the final file to respectively
the beginning of each stream and to the end of the first packet in each
stream, and additionally that the client on the basis of the parameters of the
video streams generates subsets of the streams, the switching between the
streams only taking place in these subsets.

The above-mentioned second object is achieved with a method for
client-executed search and retrieval of video information in a shared network
resource, characterised by [...] downloading the header of the [...] in a
video stream, comparing the number x of the desired frame [...], continuing
the transmission and decoding of the video stream from and including the
first frame of the packet [...] the following group, and, as is the case,
continuing the process until [...], thereafter downloading and decoding [...]
including the first frame to the packet where [...], such that a [...]
frame-based and packet-formatted search and retrieval of video information
are realized with the use of a suitable version of HTTP as a transport
protocol.

In the method for client-executed search and retrieval of information
advantageously an HTTP server adapted to HTTP version 1.1 is used, with HTTP
version 1.1 as a transport protocol.

Both in the method for transmission and in the method for client-executed
search and retrieval of information it is advantageous generating and [...]
stack list at transmission and removed therefrom after processing the
response received from the HTTP server, and retransmitting the requests in
the stack list at reestablished connection after a possible disconnection
between the HTTP server and client during the reception of the response. In
[...] of the packet being [...] for the possibly already received data.
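
The handling of the stack list can be sketched as follows, assuming that each
request is identified by its byte range, is entered in the list when it is
transmitted and is removed once its response has been processed. The class and
method names are hypothetical, and the correction of a request for data that
has already been received is left out of the sketch.

```python
class RequestList:
    """Keeps track of requests that have been sent but not yet answered."""

    def __init__(self):
        self._pending = []

    def sent(self, byte_range):
        """Enter a request in the list at transmission."""
        self._pending.append(byte_range)

    def processed(self, byte_range):
        """Remove a request once its response has been processed."""
        self._pending.remove(byte_range)

    def to_retransmit(self):
        """Requests to send again after the connection is re-established."""
        return list(self._pending)
```

After a disconnection the client only has to reopen the connection and send
the requests returned by to_retransmit in their original order.
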

The invention shall now be explained in more detail by means of exemplary
embodiments and with reference to the appended drawing figures, wherein
fig. 1 shows a schematic overview of the generation of the video file as well
as the server-client-system applied in the present invention,

fig. 2a the structure of a video file as used in the present invention,

fig. 2b the header of the video file in fig. 2a,

fig. 3a the packet division of a video stream as used in the video file in fig.
2a,

fig. 3b the composition of a packet in the video stream in fig. 3a,
fig. 3c the header of the packet in fig. 3b,

fig. 4 schematically the principle for achieving bandwidth scaling when
switching between video streams,

fig. 5a how a video stream is divided in groups of packets,

fig. 5b the structure of the separate group in the video stream in fig. 5a,

fig. 5c the composition of the packets in the video streams in fig. 5b,

fig. 5d the header of the first packet in a group in the video stream in fig. 5a,

fig. 6a a buffer memory provided before the decoder of the client, and

fig. 6b the flow diagram of a preferred embodiment of the pseudorandom 
switching of video information transmitted with the method for transmission 
according to the invention. 

In the following detailed description of an embodiment of the present
invention it is understood that if nothing is expressly stated to the contrary,
the shared network resource is represented by Internet and that HTTP version
1.1 is used as a transport protocol. Particularly the present invention achieves
bandwidth-scalable video transmission with the use of HTTP version 1.1 as a
transport protocol by generating a bit stream which is optimized for this
purpose. It is particularly suitable using HTTP version 1.1 as transport
protocol and this implies that an encoder and an HTTP client must be used
which make bandwidth-scalable video transmission with use of HTTP 1.1
possible with a minimum of overhead. In addition the HTTP client will be
able to perform fast search in the compressed video information outside the
range already received.

The present invention shall also enable video providers to offer [...] video
from a server which is adapted to HTTP version 1.1 [...] client [...] the
range of the [...] searchable video archives wherein the search is performed
on groups of packets in a video [...] positions in the [...] provided in a
header of the compressed video file. Also, it will no longer be necessary
with a dedicated video server [...] file format is constructed which [...] of
HTTP [...] persistent connection which is offered by versions [...] applies
to HTTP-NG versions such as version 1.1. Even more particular for these
versions is that they are not disconnected [...] connection over a [...].
Simultaneously it is also possible for the client to specify a determined
byte range to be transmitted [...].

[...] generates y multiple encoded video streams [...] rates t[c] [...] the
same video information. [...] bandwidth [...] 20-500 kb/s. It is necessary
that [...] generates independently decodable video frames (IF) at given time
intervals. Such video frames IF are known as [...] formatted into y encoded
intermediate streams IS e.g. by means of [...] packets PC. These packets PC
may have varying lengths q and each [...] encoded video [...]. The streams
IS now encoded and formatted are then concatenated into the file
Φ for storage on the HTTP server, particularly a server for HTTP version 1.1,
and can on request from an HTTP client be downloaded thereto. As shown
in fig. 1, the downloaded video file is buffered in a cache memory of
suitable size in the HTTP client and is then decoded by the client’s decoder,
the decoding, of course, taking place with an average bit rate t[c] which
corresponds to the encoding bit rate for the current stream. After the
decoding a digital/analog conversion takes place in the client’s digital/analog
converter DAC, whereafter the received video information can be stored on
an analog medium or reproduced on a suitable playback device. This is trivial
and hence not shown in fig. 1. It is to be remarked that it is not necessary
with a separate extra cache or buffer memory for the decoder, as there in the
method for transmission continuously takes place a tuning between the bit
rate t[c] and the channel bit rate σ. Otherwise there will in the client be
provided a not shown buffer memory for a possible replay after decoding,
something which will be obvious to persons skilled in the art.

Fig. 2a shows the concatenated video file Φ the way it is stored on the HTTP
server. The file Φ is as mentioned concatenated of y streams IS1-ISy and
comprises in addition a header HΦ, which is formatted in 4y+1 blocks. The
first block Y gives the number of streams y in the file Φ. Thereafter follow y
blocks T1-Ty which respectively state the average encoding bit rate for the
respective y streams IS. Now follow in the header two blocks Dk, D1 for each
stream IS. These blocks comprise information about the distances dk and d1
respectively from the beginning of the file Φ to the beginning of each stream
IS and to the end of the first packet PC1 in each stream IS. In addition there
is for each stream IS a block Π which comprises parameters for the stream
IS, the block Π possibly being segmented in several sub-blocks depending on
the number of entered parameters. These parameters may e.g. concern a
frame dimension and parameters for an audio coding. This prevents that
switching in a bandwidth scaling takes place between streams with different
frame dimensions, and if non-bandwidth-scalable audio information is
interfoliated with video information, switching between streams which
contain different audio encoding is prevented. Specifically in fig. 2b the
block Dk,1 gives the distance to the beginning of the stream IS1 and the block
D1,1 the distance to the end of the first packet PC1 in the stream IS1, while Π1
contains parameter information about the stream IS1. Correspondingly the
block Dk,y gives the distance to the beginning of the last stream ISy in the file
Φ, the block D1,y the distance to the end of the first packet PC1 in the stream
ISy, and the block Πy then, of course, parameter information about the stream
ISy.
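
The layout of the file header in 4y+1 blocks can be illustrated with a small
parser. The description does not give the size or byte order of the blocks nor
their exact sequence after the blocks T, so the sketch assumes, purely for
illustration, that every block is one unsigned 32-bit big-endian integer, that
the parameter block of a stream is a single such integer, and that the blocks
Dk, D1 and the parameter block follow stream by stream.

```python
import struct


def parse_file_header(data: bytes):
    """Parse a header of 4y+1 blocks: Y, T1..Ty, then (Dk, D1, params) per stream."""
    (y,) = struct.unpack_from(">I", data, 0)          # block Y: number of streams
    rates = struct.unpack_from(f">{y}I", data, 4)     # blocks T1..Ty: average bit rates
    streams = []
    position = 4 + 4 * y
    for k in range(y):
        dk, d1, params = struct.unpack_from(">3I", data, position)
        streams.append({
            "bit_rate": rates[k],
            "start": dk,                # distance to the beginning of the stream
            "first_packet_end": d1,     # distance to the end of its first packet
            "params": params,
        })
        position += 12
    return streams
```

With these two distances the client can request the first packet of any stream
directly by byte range once the file header has been read.
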

Fig. 3a shows how each stream IS is divided into p packets PC1-PCp. Each
packet PC has length q and preferably the packet length is at most equal to a
configurable request time-out interval for the HTTP server [...] packet [...]
a balance between the packet overhead, the [...] and, as is to be mentioned
later, the time interval between a possible switching of streams IS. If these
conditions are met, the packet length q may otherwise be variable.

As shown in fig. 3b, each packet PC comprises a header Hpc and a payload PL,
the payload containing the encoded video signals for a time segment which
corresponds to the payload. The payload PL of each packet PC contains a
number n of video frames which can be different from packet to packet.
Further, it is necessary that the payload PL starts with an independently
decodable video frame such that the decoder can decode each packet
independently. The header Hpc of a packet PC is shown in fig. 3c. A first block
D1 contains information about the distance d1 to the beginning of the
following packet, a second block D2 the information about the distance d2 to
the end of the following packet, a third block N information of the number n
of video frames in the packet, the fourth block B a reference to the
corresponding packet in the encoded intermediate stream ISb with a closest
lower average bit rate t[ck-1], and a fifth block A a reference to the
corresponding packet in the encoded intermediate stream ISa with a closest
higher average bit rate t[ck+1], t[ck] being the bit rate for the present stream
ISk, and k, b, a ∈ y.

In addition the header Hpc of a packet PC can further contain information
which is used if a method for searching the video information is
implemented, as is to be discussed in the following.

It is to be understood that the streams IS can be provided randomly in the file
Φ. The stream ISL which has the lowest bit rate t[cL] will refer only to
packets in the stream with the closest higher bit rate and correspondingly the
stream ISH with the highest bit rate t[cH] will refer to packets in the stream
with the closest lower bit rate.

It is, however, preferred and suitable that the streams IS are provided
successively with increasing bit rate t[c] in the file such that the stream ISL
with the lowest bit rate t[cL] becomes the first video stream IS1 in the file Φ
and the stream ISH with the highest bit rate t[cH] the last video stream ISy in
the file Φ. Preferably the encoded streams VS, IS contain the same number of
video frames F. This does not prevent that a variable video frame rate may be
used, as it will be possible to define an adjustment frame Fs as a particular
type of frame ("no frame") and to adjust the frame rate in each stream IS to
the same rate by inserting a suitable number of adjustment frames Fs in the
respective streams IS. The client will then interpret an adjustment frame Fs
as a repetition of the preceding decoded video frame. The stream ISb which
has the closest lower bit rate t[ck-1] relative to the bit rate t[ck] of the current
stream ISk shall have an independently decodable video frame IF at positions
corresponding to the beginning of each packet in the current stream ISk.
Correspondingly shall the stream ISa which has the closest higher bit rate
t[ck+1] relative to the bit rate t[ck] of the current stream ISk also have an
independently decodable video frame at positions corresponding to the
beginning of each packet in the current stream ISk. Finally, each stream IS
will be dependent on the streams with respectively the closest lower bit rate
and the closest higher bit rate and this entails that the encoded streams must
be processed in parallel, frame by frame.
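
The use of adjustment frames can be sketched as follows. The sketch represents
an adjustment frame by the Python value None and assumes, for simplicity, that
the frame rate of a stream is raised by an integer factor; both choices and the
function names are illustrative assumptions. It also relies on every packet
beginning with an independently decodable frame, so that an adjustment frame
never comes first.

```python
ADJUSTMENT = None  # stand-in for an adjustment frame ("no frame")


def pad_with_adjustment_frames(frames, factor):
    """Encoder side: follow every real frame by factor-1 adjustment frames."""
    padded = []
    for frame in frames:
        padded.append(frame)
        padded.extend([ADJUSTMENT] * (factor - 1))
    return padded


def resolve_adjustment_frames(decoded):
    """Client side: treat each adjustment frame as a repetition of the preceding frame."""
    shown = []
    for frame in decoded:
        shown.append(shown[-1] if frame is ADJUSTMENT else frame)
    return shown
```

After padding, all streams hold the same number of frames, so that a packet
boundary in one stream corresponds to the same frame number in its neighbours.
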

The video file Φ with the encoded streams IS is located on an HTTP server
which is adapted to a suitable version of HTTP, specifically HTTP version
1.1. Generally the transmission starts with the client first requesting the
HTTP server for a fixed number of bytes. This fixed number of bytes shall
include the header HΦ of the video file Φ and the start of the first stream VS1
in the video file Φ. Under certain circumstances it shall, however, not be
possible to switch between all video streams IS in the file Φ. If e.g. the video
streams have different frame dimensions or if the audio information which is
interfoliated with the video information is not bandwidth-scalable and
encoded with different values for each video stream, it will not be possible to
switch between such video streams as they are not bandwidth-scalable. In
order to prevent switching between video streams which are not
bandwidth-scalable, the client must interpret the parameters of the header HΦ
and decide between which video streams switching can take place. It is,
however, in some cases only possible to switch between a subset of the total
number y of video streams and the client will then, as mentioned above,
generate subsets of the streams IS, such that switching between the streams
IS only can take place in the subsets.

WO 02/054284    PCT/NO02/00002
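
The generation of subsets can be sketched as follows, assuming that the
parameters of each stream have been read from the file header into a tuple,
here hypothetically (frame width, frame height, audio coding), and that two
streams may be switched between only when these tuples are identical.

```python
from collections import defaultdict


def switching_subsets(stream_params):
    """Group stream indices so that switching only occurs between streams
    whose parameters (e.g. frame dimension, audio coding) are identical."""
    groups = defaultdict(list)
    for index, params in enumerate(stream_params):
        groups[params].append(index)
    return list(groups.values())
```

A client would then restrict every later stream switch to the subset that
contains the stream it is currently receiving.
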

It is necessary that the client estimates the channel bit rate σ during the
transmission. If the estimate of the channel bit rate σ is lower than the
average bit rate t[ck] of the current stream ISk, it is first switched to the
stream ISb with the closest lower average bit rate t[ck-1]. If the estimate of
the channel bit rate σ is higher than the average bit rate t[ck] of the current
stream ISk, switching takes place to the stream ISa with the closest higher
average bit rate t[ck+1]. If the streams IS are provided successively with
increasing bit rates t[c], switching in that case takes place to one of the
neighbouring streams ISk-1, ISk+1 of the current stream ISk. If the current
stream ISk is the stream with the lowest bit rate t[cL] and provided as the
first stream IS1 in the file, it can, of course, only be switched to the
following stream IS2. Correspondingly, if the current stream ISk is equal to
the stream with the highest bit rate ISH, switching can only take place to the
closest preceding stream ISH-1. By switching between the streams the
encoding bit rate t[c] can be adapted to a current channel bit rate σ and it is
thus achieved bandwidth-scalable video transmission over e.g. HTTP version
1.1.
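
The switching rule for streams provided with increasing bit rates can be
sketched as follows. The function name is hypothetical and the sketch applies
the comparison without any boundary value, so it is a simplified illustration
rather than a complete switching strategy.

```python
def next_stream(k, sigma, bit_rates):
    """Return the index of the stream to use for the following packet.

    bit_rates is ordered by increasing average bit rate, k is the index of
    the current stream and sigma the current estimate of the channel bit rate.
    """
    if sigma < bit_rates[k] and k > 0:
        return k - 1                      # closest lower neighbour
    if sigma > bit_rates[k] and k < len(bit_rates) - 1:
        return k + 1                      # closest higher neighbour
    return k                              # no switch possible or needed
```

The first and the last stream have only one neighbour, which is why the index
is left unchanged when no lower or higher stream exists.
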

With reference to fig. 4 there shall now be given a more specific discussion
on how bandwidth-scalable video transmission is realized by switching
between the video streams in the video file Φ. In fig. 4 arrows indicate the
distances whereto the distance blocks D1, D2 in a header Hpc refer. Likewise
[...] corresponding packets in the stream [...] indicated by arrows between
the blocks. A closer discussion of how the [...] shall now be given in the
following.

In the example in fig. 4 there are shown six video streams IS1-IS6 as
concatenated and which with the addition of a file header HΦ form the video
file Φ. Each stream IS comprises as shown in fig. 4 three packets and [...]
the same video information. Both the number of streams and number of
packets in each stream are, of course, wholly schematic examples and
formatting in streams and packets will in reality be adapted to the
circumstances which the transmission of the current video information
requires. In fig. 4 the streams may e.g. be coded with the bit rates t[c] as
stated in the above example and thus cover an expected channel bit rate range
of 20-500 kbits/s. Further are the streams IS in the file Φ in fig. 4 shown
provided with increasing bit rates such that the first stream IS1 in the file Φ
has the lowest bit rate t[cL] and the last stream IS6 in the file Φ the highest
bit rate t[cH].

Generally each stream IS as shown in fig. 4 is formatted into p packets, with 
p = 3 in fig. 4, and each stream hence contains the same video information. 

In the header Hpc of each packet there are provided two blocks which contain
byte-formatted distance information. The block D1 states the distance d1 to
the beginning of the following packet and the block D2 the distance d2 to the
end of the following packet in the stream. As soon as the two distance blocks
D1, D2 are received for a packet, a request is made for the following packet in
the stream. The sequencing of the packet requests in this way eliminates the
roundtrip delay in the network. The requests for packets terminate when two
distance blocks D1, D2 containing the number 0 are received.
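
The request sequencing described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the byte layout of the header, so the sketch assumes that D1 and D2 are two 32-bit big-endian integers at the start of each packet and that both distances are measured from the start of the current packet. The names `build_stream` and `byte_ranges` are hypothetical.

```python
import struct

# Assumed layout (not given in the text): D1 and D2 as 32-bit big-endian integers.
HEADER = struct.Struct(">II")


def build_stream(payloads):
    """Concatenate packets, each prefixed with its distance blocks (D1, D2)."""
    out = bytearray()
    for i, payload in enumerate(payloads):
        size = HEADER.size + len(payload)
        if i + 1 < len(payloads):
            next_size = HEADER.size + len(payloads[i + 1])
            # Assumption: distances are counted from the start of this packet.
            d1, d2 = size, size + next_size
        else:
            d1, d2 = 0, 0  # two zero blocks terminate the request sequence
        out += HEADER.pack(d1, d2) + payload
    return bytes(out)


def byte_ranges(stream, first_packet_end):
    """Yield the inclusive byte range of each packet, in request order."""
    start, end = 0, first_packet_end
    while True:
        yield (start, end - 1)
        d1, d2 = HEADER.unpack_from(stream, start)
        if d1 == 0 and d2 == 0:
            break
        # The next request can be issued as soon as D1 and D2 have arrived.
        start, end = start + d1, start + d2
```

Because the range of the following packet is known as soon as the first bytes of the current packet arrive, the next request can be sent while the current payload is still being received.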

The header Hpc of the packets PC further contains two information blocks
which are used for switching between streams, namely the information block
B which refers to the corresponding packet in the stream with the closest
lower bit rate and the information block A which refers to the corresponding
packet in the stream with the closest higher bit rate. As the streams IS are
provided in fig. 4, namely with increasing bit rates t[c], the blocks B and A
respectively shall refer to the neighbour streams of the current stream. As
shown in fig. 4 the first packet in each of the streams IS does not contain the
blocks A and B. When the current channel bit rate σ is estimated, something
which takes place when the request for the header HΦ in the video file Φ is
made, the blocks T with bit rate information will then indicate which stream
is desired and the distance blocks in the file header shall indicate the
distance from the beginning of the file Φ to respectively the beginning of
the desired video stream IS and to the end of the first packet PC1 in this
video stream. There is, however, nothing preventing the blocks B, A from
being provided in the header of the first packet PC1 in each stream IS as
well, but this is, as will be seen, a superfluous measure. As the streams, as
shown in fig. 4, are provided with increasing bit rates, the stream IS1, i.e.
the stream ISL with the lowest bit rate t[cL], can only refer to the stream
with the closest higher bit rate, and the headers of the packets in the stream
IS1 will hence, of course, only contain the block A which refers to the stream
with the closest higher bit rate, viz. the stream IS2. Correspondingly the
stream IS6, which corresponds to the stream ISH with the highest bit rate
t[cH], can only refer to the stream with the closest lower bit rate, viz. the
stream IS5. Hence the header of the packets PC of the stream IS6 does not
contain the A block, but only the B block which refers to the stream with the
closest lower bit rate. It may be mentioned that the blocks B, A can only be
introduced in positions wherein the video frame F both in the current stream
ISk and in the streams ISk-1, ISk+1 (or ISb, ISa in randomly provided streams)
are intraframes IF. Intraframes IF are independently coded video frames.
Furthermore, the respective packets must have the same length in these
streams.

In order to switch from a current stream to the streams with the closest higher
or lower bit rates t[ck+1], t[ck-1] according to how the channel bit rate σ
varies, there are requested two byte-formatted distance blocks D1, D2 with
fixed size and which start at the indicated distance in the stream to which
switching is desired, and this stream, such as being exemplified in fig. 4, of
course will be one of the neighbour streams. At the moment the client
receives the blocks D1, D2 the request sequence is broken, and as soon as the
next packet is requested, the request sequence is reintroduced. Hence
switching between streams IS only introduces a single roundtrip delay, which
again reduces the effective channel bit rate σ. It will be possible to use only
one single byte-formatted distance block in each packet, as this block then
gives the size of the next packet. This would, however, result in two
roundtrip delays during switching between the streams IS. In order to avoid
rebuffering during downloading or replay in the client, it is essential that the
switching between the streams takes place quickly. Consequently, two
distance blocks D1, D2 are used in each packet even if this increases the
packet overhead.

Obviously the overhead of bandwidth-scalability will be inversely
proportional to the packet length q, as larger packets result in less packet
overhead. However, large packets will lengthen the interval between the
points where switching between the streams IS may be performed. In order to
avoid rebuffering during downloading or playback in the client, it is
important that this interval is not too long. Expediently, the method for
transmission according to the present invention employs packets with a
duration of about 15 seconds. This gives a typical packet overhead of
0.20 kilobits/s, dependent on the size of the HTTP response and request.
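
As a rough check of these figures, the relation between per-packet overhead, packet duration and overhead bit rate can be written out. The sketch below takes the quoted 0.20 kilobits/s and 15 seconds at face value; which bytes (packet header, HTTP request, HTTP response headers) make up the overhead is not broken down in the text, and the function names are hypothetical.

```python
def overhead_bits_per_packet(overhead_rate_bps, packet_duration_s):
    """Overhead carried by one packet, given the average overhead bit rate."""
    return overhead_rate_bps * packet_duration_s


def overhead_rate_bps(overhead_bits, packet_duration_s):
    """Average overhead bit rate for a fixed per-packet overhead."""
    return overhead_bits / packet_duration_s


# 0.20 kbit/s over a 15 s packet corresponds to 3000 bits (375 bytes) per packet.
bits = overhead_bits_per_packet(200, 15)

# The same overhead spread over a 30 s packet halves the overhead bit rate,
# at the cost of fewer switching points.
rate_long_packets = overhead_rate_bps(bits, 30)
```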

During the initial connection the client requests a predetermined number of
bytes. The response shall comprise the header HΦ of the video file Φ and the
first part of the first stream IS1. During the response the channel bit rate σ
is estimated. Based on the estimate σ̂ the initial stream in the transmission is
selected. If the initial stream is not the stream IS1 in the video file Φ, the
initial bytes already received from the stream IS1 are discarded. If the initial
stream is the stream IS1, the next response continues from the place where
the initial request ended. As the stream IS1 as shown in the example is the
stream encoded with the lowest bit rate t[cL], initial buffering is minimized
for channels with low bit rate σ.
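
The selection of the initial stream can be illustrated with a short sketch. It assumes that "closest" means the smallest absolute difference between the target bit rate and the estimate, with ties resolved towards the lower stream; the bit rate values and the name `select_initial_stream` are hypothetical and only chosen to span the 20-500 kbits/s range mentioned earlier.

```python
def select_initial_stream(target_rates_kbps, estimate_kbps):
    """Return (index, discard_initial_bytes) for the initial stream.

    The index is that of the stream whose target bit rate is closest to the
    estimate. The second value tells whether the bytes already received from
    the first stream have to be discarded, i.e. whether another stream than
    the first one was selected.
    """
    index = min(
        range(len(target_rates_kbps)),
        key=lambda k: abs(target_rates_kbps[k] - estimate_kbps),
    )
    return index, index != 0


# Hypothetical target bit rates for six streams IS1-IS6, in kbits/s.
RATES = [20, 40, 80, 160, 320, 500]
```

With these hypothetical rates, an estimate of 100 kbits/s selects the third stream (80 kbits/s) and the bytes already received from the first stream are discarded, whereas an estimate of 25 kbits/s keeps the first stream and the next request simply continues where the initial one ended.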

Concerning the estimation of the channel bit rate it will be obvious to 
persons skilled in the art that this can take place in different ways. 

In the present invention the basis for a preferred method for estimating the
channel bit rate σ is that the video encoder generates the bit stream with
variable bit rate. Each video stream is encoded with a determined average
target bit rate t[c]. Quantization and the frame rate during the video encoding
are adjusted by some suitably selected rate control algorithm which ensures
that the target bit rate is met and in addition that the buffering in the
encoder is maintained at some predetermined value during the encoding of the
whole video sequence. On the other hand, the client must buffer a
predetermined amount of data before the decoding starts.

In addition the client will also have to buffer decoded data before playback
after digital/analog conversion, but this is as mentioned irrelevant in the
present case. If the channel bit rate σ now is equal to the target bit rate t[c]
for the video stream IS, rebuffering in the client is avoided before the
decoding. The channel bit rate can be different for the various clients and
furthermore fluctuate systematically. In order to avoid rebuffering in the
buffer or cache memory of the client preceding the decoding, the method for
transmission according to the present invention provides bandwidth scaling
by switching between streams with different target bit rates t[c]. This
switching is as mentioned made on basis of estimation of the channel bit rate
σ. The estimator is quite simply σ̂ = x/τ, where x is the number of buffered
bits in the time interval τ. As mentioned above, it is only possible to switch
between streams at the beginning of a packet. The following conditions apply
when switching between the streams shall take place:

1) If σ̂ > t[ck] and the buffer length is less than twice the minimum packet
length, then switching takes place to the stream ISa with the closest higher
target bit rate t[ca].

2) If σ̂ < t[ck] and the buffer length is less than the minimum packet length,
then switching takes place to the stream ISb with the closest lower target bit
rate t[cb].

By packet length it shall be understood the value of […]. The method disclosed
for estimation of the channel bit rate σ is easy to implement as it is
based on the buffer length and the packet length.
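
The two switching conditions can be sketched as below. The sketch follows the conditions as they are worded in the text and in claim 2 (both compare the buffer length against a multiple of the minimum packet length with "less than"); the units of buffer length and packet length are not stated, so seconds of video are assumed here, and the function names are hypothetical.

```python
def estimate_channel_rate(buffered_bits, interval_s):
    """Estimator: number of buffered bits x divided by the time interval tau."""
    return buffered_bits / interval_s


def switch_direction(estimate, target_rate, buffer_len, min_packet_len):
    """Return +1 (closest higher stream), -1 (closest lower stream) or 0 (stay).

    buffer_len and min_packet_len must be in the same unit
    (assumed: seconds of video).
    """
    if estimate > target_rate and buffer_len < 2 * min_packet_len:
        return +1  # condition 1)
    if estimate < target_rate and buffer_len < min_packet_len:
        return -1  # condition 2)
    return 0
```

The decision would only be acted upon at the next packet boundary, where the A or B block of the packet header gives the position of the corresponding packet in the neighbour stream.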

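Other estimators which are not dependent on buffer length and packet
length may also be used. For instance it will be possible to use the time
derivative of σ as estimator for the channel bit rate σ. The estimator is in
other words σ̇ = dσ/dt. The switching then takes place under the following
conditions:

1) If σ̇ > 0, switching takes place to the stream ISa with the closest higher
target bit rate t[ca].

2) If σ̇ < 0, switching takes place to the stream ISb with the closest lower
target bit rate t[cb].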

Preferably the switching in practice will take place by the time derivative σ̇
becoming equal to or exceeding a boundary value Δσ̇, such that |σ̇| ≥ |Δσ̇|.
This boundary value can be predetermined on basis of the selected encoding
bit rates t[c] and the client buffer capacity in connection with decoding. Due
to any unequal distance between the average selected encoding bit rates t[c]
it is advantageous to scale Δσ̇ in dependence on the sign of σ̇ and the
current encoding bit rate t[ck], such that switching between e.g. six streams
IS1-IS6 takes place with the use of 2·4+2 = 10 preselected values for Δσ̇. For
each of the streams ISL and ISH there will then be assigned one single
respective boundary value, as switching only will take place to the stream
with respectively the closest higher or the closest lower encoding bit rate.
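
The count of boundary values can be made concrete with a small sketch. It assumes one boundary value per stream and switching direction, with the lowest and highest stream having only one direction each, and it approximates the time derivative by a finite difference between two rate estimates. The names and the example values are hypothetical.

```python
def boundary_table(n_streams, up_values, down_values):
    """Build {(stream index, direction): boundary value}.

    The lowest stream has no downward entry and the highest no upward entry,
    so n streams need 2*(n-2) + 2 values.
    """
    table = {}
    for k in range(n_streams):
        if k < n_streams - 1:
            table[(k, +1)] = up_values[k]
        if k > 0:
            table[(k, -1)] = down_values[k]
    return table


def derivative_switch(k, rate_prev, rate_now, dt, table):
    """Return the stream index to use after comparing the derivative
    of the channel bit rate with the boundary value for stream k."""
    derivative = (rate_now - rate_prev) / dt  # finite-difference approximation
    direction = 1 if derivative > 0 else -1 if derivative < 0 else 0
    limit = table.get((k, direction))
    if limit is not None and abs(derivative) >= abs(limit):
        return k + direction
    return k
```

For six streams the table holds 2·4+2 = 10 entries, matching the count given above.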

Alternatively the estimation of the channel bit rate σ also may take place by
repeatedly integrating the channel bit rate σ over succeeding time intervals
τ = ta - tb, where ta is the upper boundary and tb the lower boundary for the
interval τ, and to perform the switching by comparing the integration results
Σ1, Σ2 for respectively the succeeding time intervals τ1, τ2 such that switching
to a stream ISa with the closest higher bit rate t[ca] takes place if Σ2 - Σ1 > 0,
and correspondingly to a stream ISb with the closest lower bit rate t[cb] if
Σ2 - Σ1 < 0. Preferably the switching also in these cases takes place when a
boundary value ΔΣ is reached or exceeded, such that |Σ2 - Σ1| ≥ |ΔΣ|. The
boundary value can be predetermined on the same basis as the boundary
value Δσ̇ for the time derivative of the channel bit rate σ, and advantageously
scaled in analogy with the scaling of Δσ̇.
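
A minimal sketch of the integration-based comparison follows. It assumes that the channel bit rate is available as equally spaced samples within each interval and integrates them with the trapezoidal rule; neither the sampling nor the integration rule is specified in the text, and the function names are hypothetical.

```python
def integrate(samples, dt):
    """Trapezoidal integral of equally spaced rate samples (spacing dt)."""
    return sum((a + b) / 2 * dt for a, b in zip(samples, samples[1:]))


def integral_switch(k, n_streams, first_interval, second_interval, dt, boundary):
    """Compare the integrals over two succeeding intervals and return the
    stream index to use, staying on stream k if the boundary is not reached."""
    diff = integrate(second_interval, dt) - integrate(first_interval, dt)
    if diff == 0 or abs(diff) < abs(boundary):
        return k
    if diff > 0:
        return min(k + 1, n_streams - 1)  # closest higher bit rate
    return max(k - 1, 0)                  # closest lower bit rate
```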

Whether the time derivative or the integration result is used as estimator for
the channel bit rate, it must in the determination of time intervals and
boundary values be taken into account that short, non-periodic and random
fluctuations may appear in both the channel bit rate σ and the encoding bit
rate t[c]. Time intervals and boundary values must hence be selected
significantly larger than respectively the maximum duration and amplitude of
fluctuations of this kind.

In connection with the method for transmission of video information
according to the present invention, it will be desirable that the client can
execute search and retrieval of the transmittable video information which is
entered in the video file Φ in form of video streams IS and encoded with
respectively different bit rates t[c] corresponding to the expected channel bit
rates σ of the client. As the streams IS furthermore are formatted with the
view that transmission shall take place over a suitable version of HTTP as
transport protocol, the search may use the information that already is present
in the header Hpc in the packets PC of the packet-formatted video streams IS.
The video information can comprise sequences of video frames which have a
mutual semantic relation, in casu video films, but may also be comprised by
single frames or still frames. It is hence desirable that search and retrieval of
video information are directed towards single frames. On the other hand it is
evident that a search for single frames and the retrieval thereof would be
very demanding in regard of resources. The method for client-executed search
and retrieval of information according to the invention is therefore based on
the retrieval taking place by localizing this single frame in a sequence of
frames, which can comprise no more than the packet where the single frame is
located. In practice this implies that the search and retrieval according to
the invention are realized as a pseudorandom search by providing further
information in the headers of the packets.

WO 02/054284    PCT/NO02/00002


As shown in fig. 5a a video file is divided into m groups G1...Gm and each of
the groups consists of r packets PC, as shall be explained more closely. It is
to be understood that the number r may differ between the various groups, but
it will be quite natural that r is a fixed number, and if the number of packets
in the stream then is not a multiple of r, the last group in the stream will,
of course, have a number of packets which is less than the chosen fixed
number r. An arbitrarily selected group Gk, k ∈ m, hence comprises packets
PCk,1, PCk,2, ..., PCk,r. The first following group Gk+1, of course, comprises
corresponding packets PCk+1,1, PCk+1,2, ..., PCk+1,r. The header Hpc in the
first packet PCk,1 in an arbitrary group Gk shall in order to realize the
method for search and retrieval additionally comprise two information blocks
beyond those shown in fig. 3c, viz. an information block Dj and an information
block J. The information block Dj contains information about a jump offset dj
which corresponds to the total length of the packets in the group Gk, and the
information block J contains information about the number j of frames in the
jump offset dj and in the first packet PCk+1,1 in the following group Gk+1. It
will now be seen that the division in groups G is conventionally given by dj
and not physical.

In the method for search and retrieval according to the invention there is first
generated a request to the HTTP server for downloading the header Hpc of
the first packet PCk,1 in a group Gk. The block Dj gives the distance dj to the
beginning of the packet PCk+1,1, and the block J gives the number j of video
frames included in the jump offset and in the first packet PCk+1,1 in the
following group Gk+1. If now the desired frame is Fx, the number x of this
frame is compared with j, and if x ∈ J, the transmission and decoding of the
video stream are resumed from and including the first frame in the packet
where the frame Fx is located, as this packet must be one of the packets in
the group Gk and the first packet PCk+1,1 in the following group Gk+1. The
number n of frames in each packet is namely given by the block N in the
header of the packets and it is then seen that
j = (nk,1 + nk,2 + ... + nk,r) + nk+1,1, i.e. the sum of all frames in the
group Gk and the number of frames in the following packet PCk+1,1 in the next
group Gk+1. The packet where the frame Fx is located will now be found and the
transmission and decoding of the video stream then take place from the first
frame of this packet.

If now x ∉ J, the header Hpc of the first packet PCk+1,1 in the next group Gk+1
is requested, the header of this packet similarly comprising the block Dj with
information about the jump offset for the group Gk+1 and the block J which
gives the number of frames j of the packets in this group and the first packet
PCk+2,1 in the next group. If now x ∈ J, the video stream is transmitted and
decoded from and including the first frame in the packet where the searched
frame is located, and if x ∉ J, the process is repeated on the packets in the
following group until the desired frame Fx is found. Hence a frame-directed
packet-formatted search and retrieval of video information are realized with
the use of a suitable version of HTTP as transport protocol. The search takes
place in jumps over a sequence of jump offsets dj, something which
contributes to reducing the search time. When the desired video frame Fx is
located within the stated frame number range J, the search is limited to
finding the packet which contains the desired frame Fx, as this will be given
in the header of the first packet in the group by the blocks Dj and J, as well as
the block N in the header of each of the packets which is included in the
group and the first following packet after the group. As the search is limited
only to finding the packet containing a desired picture, it can be described as
a pseudorandom search. It is to be remarked that the jump offset dj influences
both the overhead and the response time of the pseudorandom search. The
overhead in the pseudorandom search is namely inversely proportional to the
jump offset dj, as a larger jump offset dj results in a smaller overhead. The
use of a jump offset of e.g. 25 packets PC, i.e. r = 25, gives typically a
packet overhead of 0.02 kilobits/s, which is insignificant. The search time is
dominated by the roundtrip delay in the network, as no pipelining of the
requests takes place and only a part of the packet header Hpc in fact is
necessary. It is also seen that the resolution in a pseudorandom search of this
kind is determined by the packet length q, which in its turn is substantially
given by the number of frames n in each packet.
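
The jump search over groups can be sketched as follows. The stream is modelled only by the number of frames n in each packet, grouped as in fig. 5a; frames are numbered from 0 and each header lookup is counted as one request. This modelling, the frame numbering and the name `find_packet` are assumptions made for illustration, and no actual HTTP requests are issued.

```python
def find_packet(groups, x):
    """Locate the packet containing frame number x.

    groups: list of groups, each a list with the frame count n of every packet.
    Returns (group index, packet index, number of header requests).
    """
    base = 0      # number of the first frame in the current group
    requests = 0
    for k, group in enumerate(groups):
        requests += 1  # header of the first packet in the group: blocks Dj, J, N
        next_first = groups[k + 1][0] if k + 1 < len(groups) else 0
        j = sum(group) + next_first  # content of block J
        if x < base + j:
            # x lies in the range J: step through the packets of this group.
            start = base
            for i, n in enumerate(group):
                if i > 0:
                    requests += 1  # header of a further packet: block N
                if x < start + n:
                    return k, i, requests
                start += n
            # Not in this group, so it is in the first packet of the next one.
            return k + 1, 0, requests + 1
        base += sum(group)  # jump over the offset dj to the next group
    raise ValueError("frame number beyond the end of the stream")
```

In this model a frame far into the stream costs one request per skipped group plus at most r requests inside the final group, instead of one request per packet.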

The search and retrieval of video information can take place already from the
initialization of transmission, as it is quite natural to request the header of the
first packet PC1,1 in the first group G1 in the initial video stream ISk. If a part
of the video stream IS already has been downloaded and decoded, search and
retrieval can of course start from a current group Gk, the sequence of
frames Fα,...Fβ which is located in the client buffer then being frames
contained in the packets of the current groups. It shall in this connection be
remarked that the cache memory at most stores a few packets, but the frames
which are present in the buffer could e.g. very well be from the last packet in
the group Gk-1 and from the following group Gk. The information about
frames and jump offsets dj could then hence appropriately be related to the
position of respectively the first frame Fα, i.e. the frame with the lowest
number in the buffer, and the last frame Fβ, i.e. the frame with the highest
number in the buffer, as shown in fig. 6a. This shall be explained in more
detail with reference to the flow diagram in fig. 6b.

According to the flow diagram in fig. 6b, step 601 asks if x < α. If so, this
means that the desired frame Fx in a stream IS has a lower number than Fα, and
in step 602 the header Hpc of the very first packet PC1,1 in the stream is hence
requested. Step 603 asks whether x is located among the frames in this packet
and if the answer is yes, the payload PL of the packet PC1,1 is requested in
step 604 and decoded in step 605, whereafter the frame Fx
will be found and the process ends. If the answer is no, step 606 asks whether
x is located among the frames j which are included in the jump offset dj as
given in the first packet PC1,1 and the frames in the following group's first
packet PC2,1. If the answer is yes, the header Hpc of the next packet, i.e. the
second packet PC1,2 both in the stream IS and in the first group G1 in the
stream IS, is requested in step 607. Step 608 asks if x is found among the frame
numbers n of this packet, and if the answer is no, an iteration takes place
in step 609 and the process is repeated for the next packet PC1,next in step
608. If the answer is yes, the payload of this packet is requested in
step 610 and decoded in step 611, whereafter the process
stops. If the answer in step 606 on the contrary is that x is not contained in j,
the process proceeds to step 612 where the header of the first packet PCnext,1
in the following group Gnext, i.e. the second group G2 in the stream IS, is
requested. If the answer in step 613 is no to the question whether x is
contained in the frame numbers which are included in the jump offset dj and the
frames of the first packet in the following group Gnext+1, in this case i.e. the
third group in the stream IS, step 614 returns to step 612, as now the first
packet in the next group Gnext+1, i.e. in the present case group G3 etc., is
requested iteratively. If the answer in step 613 on the contrary is yes, the
header Hpc of the next packet PCnext in the group is requested in step 615, and
if x is not found among the frame numbers nnext of this packet in step 616, one
returns via step 617 back to step 615 and the next following packet is
requested, whereafter the process continues. If the answer in step 616 on the
contrary is yes, the payload of the packet in question is requested and in
step 619 the payload is decoded, whereafter the process stops.

If the answer in step 601 is no, the process proceeds to step 621 which asks if
x is greater than the number β of the last frame Fβ in the buffer. If the answer
is no, this means that the desired frame Fx is to be found among the frames in
the buffer and hence immediately will be decoded, wherefore the process
stops. If the answer on the contrary is yes, i.e. that the number x of the
desired frame Fx is greater than the number of Fβ, the header of the following
packet PCk,y is requested in step 622 and step 623 asks if x is found among
the frame numbers of this packet. If the answer is yes, the payload PLk,y of
the packet PCk,y is requested in step 624 and in step 625 the payload PLk,y is
decoded, whereafter the process stops. If the answer on the contrary is no in
step 623, step 626 asks whether the block J is to be found in the header Hpc
of the packet PCk,y. If the answer is no, one returns via step 627 to step 622
and now the header Hpc of the next following packet PCk,y+1 is requested,
whereafter the process continues. If the answer in step 626 on the contrary is
yes, step 628 asks whether the number x is contained in j which gives the
frame numbers included in the jump offset dj and the payload of the first
packet PCk+1,1 in a following group Gk+1. If the answer in step 628 is yes, the
header Hpc of the following packet PCk,y+1 is requested in step 629 and in
step 630 it is asked whether x is to be found among the frame numbers of this
packet. If this is not the case, an iteration takes place via step 631 and the
header of the next following packet is requested, whereafter the process
continues. If the answer in step 630 is yes, the payload PLk,y+1 of the packet
PCk,y+1 is requested in step 632 and in step 633 the payload is decoded,
whereafter the process ends. If the answer in step 628 is no, the header Hpc
of the first packet PCk+1,1 in the following group Gk+1 is requested in step
634 and step 635 asks whether x is contained among the frame numbers
included in the jump offset dj for the group Gk+1 and the first packet PCk+2,1
in the following group Gk+2. If the answer is no, an iteration takes place in
step 636 back to step 634 and now the header of the first packet in the next
following group is requested, whereafter the process continues. If the answer
in step 635 is yes, the header Hpc of the next packet PCk+1,next, i.e. the second
packet in the group Gk+1, is requested in step 637 and if x is not found among
the frame numbers of this packet in step 638, one continues via step 639 to the
next packet etc., whereafter the process continues. If the answer in step 638
is yes, the payload PLk+1,next of the packet PCk+1,next is requested in step 640
and decoded in step 641, whereafter the process stops.
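
The top-level branching of the flow diagram in fig. 6b (steps 601 and 621) can be summarized in a few lines. The sketch only classifies where the search has to start relative to the frames Fα...Fβ held in the client buffer; the return labels and the name `search_entry_point` are hypothetical.

```python
def search_entry_point(x, alpha, beta):
    """Classify the desired frame number x against the buffered range.

    alpha: number of the first frame in the buffer,
    beta:  number of the last frame in the buffer.
    """
    if x < alpha:
        # Step 601 answered yes: start from the very first packet of the stream.
        return "from_stream_start"
    if x > beta:
        # Step 621 answered yes: continue with the packet following the buffer.
        return "forward_from_buffer"
    # The frame is already buffered and can be decoded immediately.
    return "in_buffer"
```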

In the manner disclosed herein it becomes possible to find a desired frame Fx
by finding and decoding the packet which contains Fx. The search for Fx can
be initiated during the transmission of a stream as shown in fig. 6b, but can
also be initiated already when the transmission of the stream itself starts. It is
understood that it will be possible to switch between the streams IS during
the search as the channel bit rate σ varies, using the method for transmission
according to the invention.

In connection both with the transmission and the search, a stack list for HTTP
requests is generated and maintained. A request may then be placed in the
stack list when it is sent from the client and it is removed therefrom only
when the received response from the HTTP server has been processed. If a
disconnection takes place between HTTP server and client during the
transmission of video information and reception of the response, the request
in the stack list is sent anew when the connection has been reestablished. If a
disconnection takes place in connection with the reception of the packet, the
first request in the list can be updated and the beginning of the packet is then
adjusted for the data that may already have been received.
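
The bookkeeping of outstanding requests can be sketched as below. Although the text speaks of a stack list, it also refers to the first request in the list, so the sketch assumes first-in, first-out order and represents each request by its byte range; the class name `RequestList` and its methods are hypothetical.

```python
from collections import deque


class RequestList:
    """Outstanding byte-range requests, kept until their response is processed."""

    def __init__(self):
        self.pending = deque()

    def send(self, start, end):
        """Record a request when it is sent from the client."""
        self.pending.append([start, end])

    def response_processed(self):
        """Remove the oldest request once its response has been processed."""
        return tuple(self.pending.popleft())

    def reconnect(self, bytes_already_received=0):
        """Return the requests to send anew after a disconnection.

        The start of the first request is moved past the data that was
        already received before the connection was lost.
        """
        if self.pending:
            self.pending[0][0] += bytes_already_received
        return [tuple(r) for r in self.pending]
```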
CLAIMS

1. A method in transmission on request of video information in a shared
network resource, wherein the shared network resource particularly is
Internet, an intranet or extranet, wherein the video information is stored in
the form of an encoded video file on HTTP servers in the shared network
resource and accessed by clients via HTTP (Hypertext Transfer Protocol),
wherein each client has a video decoder, and wherein the initial video
information is in the form of digitized video signals, characterized by
comprising steps for

generating by means of a video encoder y multiple encoded video streams
(VS) which each comprises the video signals of the initial video information
compressed with average bit rates t[c] which cover the clients' expected
channel bit rates σ, the video encoder generating independently decodable
video frames at given time intervals;

generating y encoded intermediate streams (IS) from the corresponding
encoded video streams (VS) by dividing an encoded video stream into p
packets (PC) with varying lengths q, each packet (PC) comprising a header
(Hpc) and a payload (PL) which contains the encoded video signals for a time
segment corresponding to the payload;

providing the header (Hpc) with the following information:

(i) the distances d1 and d2 respectively to the beginning and to the end of
the nearest following packet,

(ii) the number n of video frames (F) that the packet comprises, and

(iii) a reference to the corresponding packet in respectively the encoded
intermediate stream (ISb) with a closest lower average bit rate t[ck-1] and the
encoded intermediate stream (ISa) with a closest higher average bit rate
t[ck+1], t[ck] being the bit rate for the present intermediate stream (ISk)
and k, b, a ∈ p;

providing an independently decodable video frame (IF) at the beginning of
the payload (PL) of a packet (PC);

concatenating the intermediate streams (IS) into a final file (Φ) which is
stored on one or more HTTP servers; and providing the final file (Φ) with a
header (HΦ) which contains information about the parameters of the streams
(IS); and further by the following steps effected by the client:

generating a request for the header (HΦ) and the beginning of the first stream
(IS1) in the final file (Φ);

estimating the channel bit rate σ and selecting the stream (ISk) whose bit rate
t[ck] is the closest bit rate relative to the estimated channel bit rate σ̂, as
the initial stream of the transmission, such that t[ck] ≈ σ̂, and then
estimating the channel bit rate σ during the transmission and, if the estimate
σ̂ is lower than the average bit rate t[ck] of the current stream (ISk),

switching to the stream (ISb) with the closest lower average bit rate t[ck-1]
or, if the estimate σ̂ is higher than the average bit rate t[ck] of the current
stream (ISk),

switching to the stream (ISa) with the closest higher average bit rate t[ck+1],
the switching of the streams taking place on the basis of the packet
references and realizing a bandwidth-scalable video transmission over a
version of HTTP which allows persistent connection and specification of
byte range.



2. A method according to claim 1,

characterized by estimating the channel bit rate σ on the basis of an estimator
σ̂ = x/τ, wherein σ̂ is the estimated channel bit rate and x the number of
buffered bits in the time interval τ, such that if σ̂ > t[ck] and a buffer length
less than twice the minimum packet length, switching takes place to the
stream with a closest higher target bit rate t[ca], or, if σ̂ < t[ck] and the
buffer length less than minimum packet length, switching takes place to the
stream with a closest lower target bit rate t[cb].






3. A method according to claim 1,

characterized by estimating the channel bit rate σ on the basis of an estimator
σ̇ = dσ/dt, such that if σ̇ > 0, switching takes place to the stream with a
closest higher target bit rate t[ca], or, if σ̇ < 0, switching takes place to the
stream with the closest lower target bit rate t[cb].



4. A method according to claim 3,

characterized by determining a boundary value Δσ̇, such that switching takes
place if |σ̇| ≥ |Δσ̇|.

5. A method according to claim 1,

characterized by estimating the channel bit rate σ by repeatedly integrating
the channel bit rate σ over succeeding time intervals τ = ta - tb, ta and tb
being respectively an upper and lower boundary for the interval τ, wherein an
integration result Σ is given by Σ = ∫ σ dt taken from tb to ta, and comparing
the integration results Σ2, Σ1 for respective succeeding time intervals τ2, τ1,
such that if Σ2 - Σ1 > 0, switching takes place to the stream with a closest
higher target bit rate t[ca], or, if Σ2 - Σ1 < 0, switching takes place to the
stream with a closest lower target bit rate t[cb].

6. A method according to claim 5,

characterized by determining a boundary value ΔΣ, such that switching takes
place if |Σ2 - Σ1| ≥ |ΔΣ|.

7. A method according to claim 1,

characterized by using an HTTP server adapted to HTTP version 1.1.

8. A method according to claim 7,

characterized by the video transmission taking place over HTTP version 1.1.

9. A method according to claim 1,

characterized by providing the packets (PC) of the streams (IS) in
non-overlapping successive groups (G) of two or more successive packets,
such that each stream (IS) is divided into m groups of this kind, and
providing the header of the first packet (PCk,1) in a group (Gk, k ∈ m) with
information of a jump offset dj which corresponds to the combined lengths of
the packets in the group (Gk) and the number j of frames (F) which is covered
by the jump offset dj and the first packet (PCk+1,1) in the following group
(Gk+1).

10. A method according to claim 1,

characterized by providing the streams (IS) at random in the final file (Φ),
and by the stream (ISL) with the lowest bit rate t[cL] referring only to packets
of the stream with the closest higher bit rate and the stream (ISH) with the
highest bit rate t[cH] referring only to packets in the stream with the closest
lower bit rate.

11. A method according to claim 1,

characterized by providing the streams (IS) successively with the increasing
bit rate t[c] in the final file (Φ), such that the stream (ISL) with the lowest bit
rate t[cL] is the first video stream (IS1) of the file (Φ) and the stream (ISH)
with the highest bit rate t[cH] is the last video stream (ISy) of the file (Φ).

12. A method according to claim

characterized by the streams (ISb) and (ISa) containing an independently
decodable video frame (IF) at positions corresponding to the beginning of
each packet (PC) in the current stream (ISk).

13. A method according to claim 1,

characterized by the decoded streams (VS, IS) containing the same number of
video frames (F).

14. A method according to claim 1, wherein the streams (IS) have 
different frame rates, 

characterized by defining an adjustment frame (Fg) as a particular frame type 
and adjusting the frame rates of each stream (IS) to the same rate by inserting 
a suitable number of adjustment frames in the respective streams. 

15. A method according to claim 1,

characterized by the packet length q at most being equal to a configurable 
request time-out interval for a HTTP server. 

16. A method according to claim 1, 

characterized by processing the encoded streams (IS) in parallel frame by 
frame. 

17. A method according to claim 1, 

characterized by the header (HΦ) of the final file (Φ) comprising information 
of the number y of streams (IS) of the file, the average bit rate t[c] of each 
stream and the distances dk and d1 from the beginning of the final file to 
respectively the beginning of each stream and to the end of the first packet 
(PC1) in each stream (IS). 

18. A method according to claim 1, 

characterized by the client on the basis of the parameters of the video streams 
generating subsets of the streams (IS), the switching between the streams (IS) 
only taking place in these subsets. 





19. A method according to claim 1, 

characterized by generating and maintaining a stack list for HTTP requests, a 
request being placed in the stack list at transmission and removed therefrom 
after processing the response received from the HTTP server, and by the 
requests in the stack list being retransmitted at reestablished connection after 
a possible disconnection between the HTTP server and client during 
reception of the response. 

20. A method according to claim 19, 

characterized by updating the first request if a disconnection takes place 
during reception of a packet (PC), the beginning of the packet (PC) being 
adjusted for the possibly already received data. 
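
The request bookkeeping of claims 19 and 20 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the names `RangeRequest` and `RequestStack` are hypothetical, the list is treated as first-in, first-out (the "first request" is the oldest unanswered one), and each request addresses a packet by an HTTP byte range. No network traffic is performed; the sketch only shows which Range header values would be retransmitted.

```python
from dataclasses import dataclass


@dataclass
class RangeRequest:
    start: int  # first byte requested (inclusive)
    end: int    # last byte requested (inclusive)

    def header(self) -> str:
        # Value of the HTTP Range request header for this request.
        return f"bytes={self.start}-{self.end}"


class RequestStack:
    """List of requests that have been sent but not yet fully answered."""

    def __init__(self):
        self.pending = []

    def sent(self, request):
        # A request is placed in the list at transmission.
        self.pending.append(request)

    def completed(self):
        # The oldest request is removed once its response is processed.
        return self.pending.pop(0)

    def on_disconnect(self, bytes_received):
        # Move the start of the first request past the data already
        # received, then return every pending request for retransmission.
        if self.pending:
            self.pending[0].start += bytes_received
        return [r.header() for r in self.pending]


stack = RequestStack()
stack.sent(RangeRequest(0, 999))
stack.sent(RangeRequest(1000, 1999))
resend = stack.on_disconnect(bytes_received=400)
```

Here `resend` holds the adjusted first request followed by the untouched second one, so data that already arrived before the disconnection is not downloaded twice.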

21. A method for client-executed search and retrieval of video information 
in a shared network resource, particularly search and retrieval of a desired 
frame (Fx) in a video stream, wherein the shared network resource 
particularly is Internet, an intranet or extranet, wherein the video information 
is stored in form of an encoded video file on HTTP servers in the shared 
network resource and accessed by clients via HTTP (Hypertext Transfer 
Protocol), wherein each client has a video decoder, wherein the encoded 
video file (Φ) is concatenated of multiple encoded video streams (IS) which 
contain the video signals of the video information compressed at an average 
bit rate t[c] which covers the client's expected channel bit rate σ, wherein 
each encoded video stream is divided into p packets (PC) with varying 
lengths q, wherein each packet (PC) comprises a header (HPC) and payload 
(PL), wherein the packets (PC) in a stream (IS) are provided in 
non-overlapping, successive groups (G) of two or more successive packets, 
such that each stream (IS) is divided in m groups of this kind, wherein the 
header (HPC) of the first packet (PCk,1) of each group (Gk, k∈m) in addition to 
information of the number n of video frames which the packet contains and 
references to other packets and streams, is provided with information of a 
jump offset dj which corresponds to the combined lengths of the packets in 
the group and the number j of frames (F) which the jump offset dj and the first 
following packet (PCk+1,1) in the following group (Gk+1) comprise, wherein 
the video file (Φ) further comprises a header (HΦ) which contains 
information of the parameters of the streams (IS), wherein the information of 
the parameters of the streams (IS) includes the distances dk and d1 from the 
beginning of the video file (Φ) to respectively the beginning of each stream 
(IS) and to the end of the first packet (PC1) in each stream (IS), wherein the 
transmission of video information takes place bandwidth-scalable over a 
version of HTTP which allows persistent connection and specification of 
byte range, and wherein the method is 

characterized by generating a request to the HTTP server for downloading 
the header (HPC) of the first packet (PCk,1) in a group (Gk, k∈m) in a video 
stream (ISk, k∈y), comparing the number x of the desired frame (Fx) with j, 
and if x∈j, continuing the transmission and decoding of the video stream 
from and including the first frame of the packet wherein the frame (Fx) is 
located, this packet being one of the packets in the group (Gk) and the first 
packet (PCk+1,1) in the following group (Gk+1), or, if x∉j, requesting the 
header (HPC) of the first packet (PCk+1,1) in the following group (Gk+1), and, 
as is the case, continuing the process until the desired frame (Fx) has been 
found, whereafter downloading and decoding of the stream (IS) are continued 
from and including the first frame in the packet where the desired frame (Fx) 
is located, such that a frame-based and packet-formatted search and retrieval 
of video information are realized with the use of a suitable version of HTTP 
as a transport protocol. 
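
The group-by-group search of claim 21 can be sketched as below. The sketch makes several assumptions that the claim does not state: frames are numbered from 0 within the stream, each group header is modelled as a dictionary with the keys `dj` (jump offset in bytes), `j` (frames in the group plus the first packet of the following group) and `n` (frames in the group's first packet), and `fetch_header` is a hypothetical stand-in for a byte-range request that returns `None` past the end of the stream. Locating the exact packet inside the returned group is left out.

```python
def locate_group(fetch_header, first_offset, x):
    """Find the group from which frame number x can be reached.

    Returns (offset of the group's first packet header,
             number of the first frame in that group,
             number of header requests issued).
    """
    offset = first_offset
    base = 0  # number of the first frame in the current group
    header = fetch_header(offset)
    requests = 1
    while True:
        if x < base + header["j"]:
            # x lies in this group or in the first packet of the next one.
            return offset, base, requests
        next_offset = offset + header["dj"]  # jump over the whole group
        nxt = fetch_header(next_offset)
        if nxt is None:
            raise LookupError(f"frame {x} is beyond the end of the stream")
        requests += 1
        # Frames in the current group = j minus the next group's first packet.
        base += header["j"] - nxt["n"]
        offset, header = next_offset, nxt


# Three groups holding 20, 20 and 10 frames (illustrative values only).
headers = {
    0:    {"dj": 5000, "n": 10, "j": 28},
    5000: {"dj": 4000, "n": 8,  "j": 25},
    9000: {"dj": 3000, "n": 5,  "j": 10},
}
result = locate_group(headers.get, 0, 30)
```

Only one header is requested per group, so the number of requests grows with the number of groups skipped rather than with the number of packets.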

22. A method according to claim 21, 

characterized by generating and maintaining a stack list for HTTP requests, a 
request being placed in the stack list at transmission and removed therefrom 
after processing the response received from the HTTP server, and by the 
requests in the stack list being retransmitted at reestablished connection after 
a possible disconnection between the HTTP server and client during the 
reception of the response. 

23. A method according to claim 22, 

characterized by updating the first request if a disconnection takes place 
during reception of a packet (PC), the beginning of the packet (PC) being 
adjusted for the possibly already received data. 

24. A method according to claim 21, 

characterized by using a HTTP server adapted to HTTP version 1.1 and that 
HTTP version 1.1 is used as a transport protocol. 